Faster Beam-Search Decoding for Phrasal Statistical Machine Translation
Authors
Abstract
Pharaoh is a widely-used state-of-the-art decoder for phrasal statistical machine translation. In this paper, we present two modifications to the algorithm used by Pharaoh that together permit much faster decoding without losing translation quality as measured by BLEU score. The first modification improves the estimated translation model score used by Pharaoh to evaluate partial hypotheses, by incorporating an estimate of the distortion penalty to be incurred in translating the rest of the sentence. The second modification uses early pruning of possible next-phrase translations to cut down the overall size of the search space. These modifications enable decoding speed-ups of an order of magnitude or more, with no reduction in the BLEU score of the resulting translations.
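As a rough illustration of the second modification, the sketch below prunes candidate next-phrase translations before any hypothesis is extended with them. It is a minimal sketch under stated assumptions, not the paper's actual procedure: the `Option` representation, the function name `prune_options`, and the two parameters `max_options` and `log_threshold` are all hypothetical, and the estimated scores are assumed to be log-probabilities supplied by the caller.

```python
from typing import List, Tuple

# Hypothetical representation of a candidate next-phrase translation:
# (target phrase, estimated log-score). Not taken from Pharaoh or the paper.
Option = Tuple[str, float]


def prune_options(
    options: List[Option], max_options: int, log_threshold: float
) -> List[Option]:
    """Keep only the most promising candidates before expanding hypotheses.

    A candidate survives if its estimated log-score is within
    `log_threshold` of the best candidate, and at most `max_options`
    candidates are kept overall.
    """
    if not options:
        return []
    # Rank candidates from best (highest log-score) to worst.
    ranked = sorted(options, key=lambda opt: opt[1], reverse=True)
    best_score = ranked[0][1]
    # Threshold pruning: drop candidates far below the best one.
    kept = [opt for opt in ranked if opt[1] >= best_score - log_threshold]
    # Histogram pruning: cap the number of survivors.
    return kept[:max_options]


candidates = [("a", -1.0), ("b", -1.5), ("c", -4.0), ("d", -1.2)]
survivors = prune_options(candidates, max_options=2, log_threshold=1.0)
```

Discarding weak candidates at this point means fewer hypotheses are created and scored later, which is the general idea behind shrinking the search space early; the specific scoring and thresholds used in the paper are not given in the abstract.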
Related resources
Incremental Decoding for Phrase-Based Statistical Machine Translation
In this paper we focus on incremental decoding for a statistical phrase-based machine translation system. In incremental decoding, translations are generated incrementally for every word typed by a user, instead of waiting for the entire sentence as input. We introduce a novel modification to the beam-search decoding algorithm for phrase-based MT to address this issue, aimed at efficient co...
Optimal Beam Search for Machine Translation
Beam search is a fast and empirically effective method for translation decoding, but it lacks formal guarantees about search error. We develop a new decoding algorithm that combines the speed of beam search with the optimal certificate property of Lagrangian relaxation, and apply it to phrase- and syntax-based translation decoding. The new method is efficient, utilizes standard MT algorithms, and...
Forest Rescoring: Faster Decoding with Integrated Language Models
Efficient decoding has been a fundamental problem in machine translation, especially with an integrated language model which is essential for achieving good translation quality. We develop faster approaches for this problem based on k-best parsing algorithms and demonstrate their effectiveness on both phrase-based and syntax-based MT systems. In both cases, our methods achieve significant speed...
Later-stage Minimum Bayes-Risk Decoding for Neural Machine Translation
For extended periods of time, sequence generation models rely on beam search as the decoding algorithm. However, the performance of beam search degrades when the model is over-confident about a suboptimal prediction. In this work, we enhance beam search by performing minimum Bayes-risk (MBR) decoding for some extra steps at a later stage. In our experiments, we found that the conventional MBR r...
Sharp Models on Dull Hardware: Fast and Accurate Neural Machine Translation Decoding on the CPU
Attentional sequence-to-sequence models have become the new standard for machine translation, but one challenge of such models is a significant increase in training and decoding cost compared to phrase-based systems. Here, we focus on efficient decoding, with a goal of achieving accuracy close to the state-of-the-art in neural machine translation (NMT), while achieving CPU decoding speed/throughpu...